# Instruction Following

## Glm 4 9b Chat Abliterated GGUF
A 9B-parameter chat model based on the GLM-4 architecture, supporting Chinese and English dialogue, quantized for a range of hardware environments.
License: Other · Tags: Large Language Model, Supports Multiple Languages · Publisher: bartowski · Downloads: 2,676 · Likes: 11
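GGUF quantizations like the one above are typically run with llama.cpp or its Python bindings. Below is a minimal sketch using llama-cpp-python; the quant filename is a placeholder for whichever quantization level you download.

```python
# Minimal sketch: running a GGUF chat quant with llama-cpp-python.
# The filename below is a placeholder; use whichever quant you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="glm-4-9b-chat-abliterated.Q4_K_M.gguf",  # hypothetical filename
    n_ctx=4096,       # context window
    n_gpu_layers=-1,  # offload all layers to GPU when one is available
)

result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Introduce yourself in one sentence."}],
    max_tokens=128,
)
print(result["choices"][0]["message"]["content"])
```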
## Shisa V2 Llama3.3 70b
Shisa V2 is a bilingual (Japanese/English) general-purpose chat model series trained by Shisa.AI. Built on Llama-3.3-70B-Instruct, it focuses on improving Japanese task performance while maintaining English capabilities.
Tags: Large Language Model, Transformers, Supports Multiple Languages · Publisher: shisa-ai · Downloads: 144 · Likes: 2
## Gemma 3 Starshine 12B Alt
A creative writing model based on fine-tuned merges of Gemma 3 12B IT and PT, optimized for role-playing with improved character differentiation and reduced identity confusion.
Tags: Large Language Model, Transformers · Publisher: ToastyPigeon · Downloads: 143 · Likes: 8
## Llama 3.1 8b DodoWild V2.01
An 8B-parameter language model based on the Llama 3.1 architecture, created by merging multiple models with mergekit, capable of text generation.
Tags: Large Language Model, Transformers · Publisher: Nexesenex · Downloads: 58 · Likes: 2
## Glm Edge 1.5b Chat
GLM-Edge-1.5B-Chat is a 1.5-billion-parameter chat model based on the GLM architecture, suited to Chinese dialogue scenarios.
License: Other · Tags: Large Language Model, Safetensors · Publisher: THUDM · Downloads: 891 · Likes: 17
## Llava NeXT Video 7B DPO Hf
LLaVA-NeXT-Video is an open-source multimodal chatbot optimized through mixed training on video and image data, with strong video understanding capabilities.
Tags: Video-to-Text, Transformers, English · Publisher: llava-hf · Downloads: 12.61k · Likes: 9
## Llava NeXT Video 7B Hf
LLaVA-NeXT-Video is an open-source multimodal chatbot that achieves strong video understanding through mixed training on video and image data, reaching SOTA among open-source models on the VideoMME benchmark.
Tags: Video-to-Text, Transformers, English · Publisher: llava-hf · Downloads: 65.95k · Likes: 88
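The llava-hf releases are packaged for direct use with the transformers library. The sketch below follows the general transformers pattern; frame sampling is stubbed out with a dummy clip, and exact prompt formatting should be checked against the model card.

```python
# Sketch of querying LLaVA-NeXT-Video via transformers (general pattern;
# consult the llava-hf/LLaVA-NeXT-Video-7B-hf model card for exact usage).
import numpy as np
import torch
from transformers import LlavaNextVideoForConditionalGeneration, LlavaNextVideoProcessor

model_id = "llava-hf/LLaVA-NeXT-Video-7B-hf"
processor = LlavaNextVideoProcessor.from_pretrained(model_id)
model = LlavaNextVideoForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Dummy clip of 8 RGB frames; in practice, sample frames from a real video.
video = np.zeros((8, 336, 336, 3), dtype=np.uint8)
prompt = "USER: <video>\nWhat is happening in this video? ASSISTANT:"

inputs = processor(text=prompt, videos=video, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=100)
print(processor.batch_decode(output, skip_special_tokens=True)[0])
```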
## Llava Llama 3 8b
A large multimodal model trained under the LLaVA-v1.5 framework, using the 8-billion-parameter Meta-Llama-3-8B-Instruct as the language backbone together with a CLIP-based visual encoder.
License: Other · Tags: Image-to-Text, Transformers · Publisher: Intel · Downloads: 387 · Likes: 14
## Llava Meta Llama 3 8B Instruct
A multimodal model integrating Meta-Llama-3-8B-Instruct and LLaVA-v1.5, providing advanced vision-language understanding capabilities.
Tags: Image-to-Text, Transformers · Publisher: MBZUAI · Downloads: 20 · Likes: 11
## Llava NeXT Video 7B DPO
LLaVA-NeXT-Video is an open-source multimodal dialogue model, built by fine-tuning a large language model on multimodal instruction-following data and supporting mixed video and text interaction.
Tags: Video-to-Text, Transformers · Publisher: lmms-lab · Downloads: 8,049 · Likes: 27
## Llava NeXT Video 7B
LLaVA-NeXT-Video is an open-source multimodal chatbot, fine-tuned from a large language model and supporting mixed video and text interaction.
Tags: Video-to-Text, Transformers · Publisher: lmms-lab · Downloads: 1,146 · Likes: 46
## Llava Gemma 7b
LLaVA-Gemma-7b is a large multimodal model trained under the LLaVA-v1.5 framework, using google/gemma-7b-it as the language backbone combined with a CLIP visual encoder; suitable for multimodal understanding and generation tasks.
Tags: Image-to-Text, Transformers, English · Publisher: Intel · Downloads: 161 · Likes: 11
## Llava Gemma 2b
LLaVA-Gemma-2b is a large multimodal model trained under the LLaVA-v1.5 framework, using the 2-billion-parameter Gemma-2b-it as the language backbone combined with a CLIP visual encoder.
Tags: Image-to-Text, Transformers, English · Publisher: Intel · Downloads: 1,503 · Likes: 44
## Llava V1.6 Vicuna 13b Gguf
LLaVA is an open-source multimodal chatbot based on the Transformer architecture, offering various versions that balance size and quality through quantization.
License: Apache-2.0 · Tags: Image-to-Text · Publisher: cjpais · Downloads: 630 · Likes: 9
## Llava V1.6 Vicuna 7b Gguf
LLaVA is an open-source multimodal chatbot trained by fine-tuning an LLM on multimodal instruction-following data. This is the GGUF-quantized release, offering multiple quantization options.
License: Apache-2.0 · Tags: Image-to-Text · Publisher: cjpais · Downloads: 493 · Likes: 5
## Llava V1.5 7b Gguf
LLaVA is an open-source multimodal chatbot, fine-tuned on LLaMA/Vicuna and trained with GPT-generated multimodal instruction-following data.
Tags: Image-to-Text · Publisher: granddad · Downloads: 13 · Likes: 0
## Llava 1.6 Mistral 7b Gguf
LLaVA is an open-source multimodal chatbot trained by fine-tuning an LLM on multimodal instruction-following data. This is the GGUF-quantized release, offering multiple quantization options.
License: Apache-2.0 · Tags: Image-to-Text · Publisher: cjpais · Downloads: 9,652 · Likes: 106
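These GGUF LLaVA builds pair a quantized language model with a separate CLIP projector file (mmproj). A sketch using llama-cpp-python's vision chat handler follows; both file names are placeholders, and Llava15ChatHandler is used here as the documented example even though handlers for newer LLaVA variants also exist.

```python
# Sketch: running a GGUF LLaVA build with llama-cpp-python.
# Both file paths are placeholders for the files downloaded from the repo.
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

chat_handler = Llava15ChatHandler(clip_model_path="mmproj-model-f16.gguf")
llm = Llama(
    model_path="llava-1.6-mistral-7b.Q4_K_M.gguf",
    chat_handler=chat_handler,
    n_ctx=4096,  # leave room for the image embedding tokens
)

result = llm.create_chat_completion(
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
            {"type": "text", "text": "Describe this image."},
        ],
    }]
)
print(result["choices"][0]["message"]["content"])
```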
## Llava V1.6 Vicuna 13b
LLaVA is an open-source multimodal chatbot, fine-tuned on large language models with multimodal instruction-following data.
Tags: Image-to-Text, Transformers · Publisher: liuhaotian · Downloads: 7,080 · Likes: 56
## Llava V1.6 Mistral 7b
LLaVA is an open-source multimodal chatbot, trained by fine-tuning large language models on multimodal instruction-following data.
License: Apache-2.0 · Tags: Image-to-Text, Transformers · Publisher: liuhaotian · Downloads: 27.45k · Likes: 236
## Openchat 3.5 0106 GGUF
Openchat 3.5 0106 is an open-source dialogue model based on the Mistral architecture, developed by the OpenChat team. This model focuses on delivering high-quality conversational experiences and supports various dialogue tasks.
License: Apache-2.0 · Tags: Large Language Model · Publisher: TheBloke · Downloads: 4,268 · Likes: 72
## Liuhaotian Llava V1.5 13b GGUF
LLaVA is an open-source multimodal chatbot based on the LLaMA/Vicuna architecture, fine-tuned with multimodal instruction-following data.
Tags: Image-to-Text · Publisher: PsiPi · Downloads: 1,225 · Likes: 36
## Rose 20B
An experimental stacked merge of Thorn-13B and Noromaid-13B that excels in role-playing scenarios.
Tags: Large Language Model, Transformers, English · Publisher: tavtav · Downloads: 114 · Likes: 40
## Openchat 3.5 GPTQ
OpenChat 3.5 7B is a 7B-parameter large language model based on the Mistral architecture, developed by the OpenChat team and released under the Apache 2.0 license.
License: Apache-2.0 · Tags: Large Language Model, Transformers · Publisher: TheBloke · Downloads: 107 · Likes: 17
## Llava V1.5 13b Lora
LLaVA is an open-source multimodal chatbot, fine-tuned from LLaMA/Vicuna and trained on GPT-generated multimodal instruction-following data.
Tags: Image-to-Text, Transformers · Publisher: liuhaotian · Downloads: 143 · Likes: 26
## Llava V1.5 7b Lora
LLaVA is an open-source multimodal chatbot, fine-tuned from LLaMA/Vicuna on GPT-generated multimodal instruction data; this release ships the LoRA adapter weights.
Tags: Image-to-Text, Transformers · Publisher: liuhaotian · Downloads: 413 · Likes: 23
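The -Lora releases above contain adapter weights rather than a full checkpoint. The official LLaVA repository provides its own scripts for loading and merging them; the snippet below only illustrates the general LoRA-attach mechanism using peft, with placeholder paths.

```python
# Generic illustration of attaching LoRA adapter weights with peft.
# Not LLaVA's actual loading path; the paths below are placeholders.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("path/to/base-model")
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")
model = model.merge_and_unload()  # fold the low-rank deltas into the base weights
```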
## Llava V1.5 13B AWQ
LLaVA is an open-source multimodal chatbot, fine-tuned on GPT-generated multimodal instruction-following data based on LLaMA/Vicuna.
Tags: Image-to-Text, Transformers · Publisher: TheBloke · Downloads: 141 · Likes: 35
## Longalpaca 70B
LongLoRA is an efficient fine-tuning technique that gives large language models long-context capability via shifted sparse attention (S²-Attn), supporting context lengths from 8k to 100k tokens.
Tags: Large Language Model, Transformers · Publisher: Yukang · Downloads: 1,293 · Likes: 21
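The shifted-attention idea behind LongLoRA can be sketched briefly: the sequence is split into groups that attend locally, and half of the attention heads are rolled by half a group so information can cross group boundaries. The snippet below is an illustrative reconstruction, not the LongLoRA implementation.

```python
# Illustrative reconstruction of LongLoRA's shifted sparse attention (S2-Attn)
# grouping; the real implementation lives in the LongLoRA repository.
import torch

def shift_into_groups(x: torch.Tensor, group_size: int) -> torch.Tensor:
    # x: (batch, seq_len, num_heads, head_dim); seq_len divisible by group_size
    b, n, h, d = x.shape
    x = x.clone()
    # Roll the second half of the heads by half a group along the sequence
    # axis, so neighboring groups overlap through those heads.
    x[:, :, h // 2:] = torch.roll(x[:, :, h // 2:], shifts=-group_size // 2, dims=1)
    # Attention is then computed independently inside each group.
    return x.reshape(b, n // group_size, group_size, h, d)

# Example: one 8192-token sequence, 32 heads, grouped into 2048-token blocks.
q = torch.randn(1, 8192, 32, 128)
print(shift_into_groups(q, 2048).shape)  # torch.Size([1, 4, 2048, 32, 128])
```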
## Llava V1.5 13b
LLaVA is an open-source multimodal chatbot, fine-tuned from LLaMA/Vicuna and integrated with visual capabilities, supporting interaction with both images and text.
Tags: Image-to-Text, Transformers · Publisher: liuhaotian · Downloads: 98.17k · Likes: 499
## Mentallama Chat 13B
The first open-source large language model with instruction-following capability for explainable mental health analysis.
License: MIT · Tags: Large Language Model, Transformers, English · Publisher: klyang · Downloads: 326 · Likes: 17
## Mentallama Chat 7B
The first open-source large language model with instruction-following capability for explainable mental health analysis, fine-tuned from LLaMA2-chat-7B.
License: MIT · Tags: Large Language Model, Transformers, English · Publisher: klyang · Downloads: 3,336 · Likes: 23
## Synthia 70B V1.2b
SynthIA (Synthetic Intelligent Agent) is a LLaMA-2-70B model trained on an Orca-style dataset, excelling at instruction following and long-form dialogue.
Tags: Large Language Model, Transformers, English · Publisher: migtissera · Downloads: 136 · Likes: 29
## Mythalion 13B GGUF
Mythalion 13B is a 13B-parameter large language model developed by PygmalionAI, based on the Llama architecture and specializing in text generation and instruction-following tasks.
Tags: Large Language Model, English · Publisher: TheBloke · Downloads: 2,609 · Likes: 67